


TR(1L)            MISC. REFERENCE MANUAL PAGES             TR(1L)



NAME
     tr - translate or delete characters

SYNOPSIS
     tr      [-cst]      [--complement]       [--squeeze-repeats]
     [--truncate-set1] string1 string2
     tr {-s,--squeeze-repeats} [-c] [--complement] string1
     tr {-d,--delete} [-c] [--complement] string1
     tr {-d,--delete} {-s,--squeeze-repeats} [-c]  [--complement]
     string1 string2

DESCRIPTION
     This manual page documents the GNU version of tr. tr  copies
     the standard input to the standard output, performing one of
     the following operations:

          o+ translate, and optionally squeeze repeated characters
          in the result
          o+ squeeze repeated characters
          o+ delete characters
          o+ delete characters, then squeeze  repeated  characters
          from the result.

     The _s_t_r_i_n_g_1 and (if given) _s_t_r_i_n_g_2 arguments define  ordered
     sets  of  characters,  referred  to  below as set1 and set2.
     These sets are the characters of the input that tr  operates
     on.   The  --_c_o_m_p_l_e_m_e_n_t  (-_c)  option replaces set1 with its
     complement (all of the characters that are not in set1).

  SPECIFYING SETS OF CHARACTERS
     The format of the _s_t_r_i_n_g_1 and  _s_t_r_i_n_g_2  arguments  resembles
     the  format  of  regular  expressions; however, they are not
     regular expressions, only lists of characters.  Most charac-
     ters  simply  represent themselves in these strings, but the
     strings can contain the shorthands listed  below,  for  con-
     venience.   Some  of  them  can  be  used only in _s_t_r_i_n_g_1 or
     _s_t_r_i_n_g_2, as noted below.

     Backslash excapes.  A backslash followed by a character  not
     listed below causes an error message.

     \a   Control-G.

     \b   Control-H.

     \f   Control-L.

     \n   Control-J.

     \r   Control-M.

     \t   Control-I.



Sun Release 4.1           Last change:                          1

     \v   Control-K.

     \ooo The character with the value given by ooo, which is 1
          to 3 octal digits.

     \\   A backslash.
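
     The escapes above can be tried out as follows.  This is a
     minimal sketch; the sample strings and variable names are
     made up for illustration.

```shell
# Translate tabs to spaces using the \t escape.
tabs_to_spaces=$(printf 'a\tb\tc\n' | tr '\t' ' ')
printf '%s\n' "$tabs_to_spaces"

# \101 is octal for the letter A; map it to a lowercase a.
octal_demo=$(printf 'ABBA\n' | tr '\101' 'a')
printf '%s\n' "$octal_demo"

# \\ stands for a single backslash; turn backslashes into slashes.
slashes=$(printf '%s\n' 'C:\tmp\file' | tr '\\' '/')
printf '%s\n' "$slashes"
```

     The escapes are quoted so that the shell passes the
     backslash through to tr unchanged.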

     Ranges.  The notation `_m-_n' expands to all of the characters
     from  _m  through  _n,  in  ascending order.  _m should collate
     before _n; if it doesn't, an error results.  As  an  example,
     `0-9' is the same as `0123456789'.  Ranges can optionally be
     enclosed in square brackets, which has no effect but is sup-
     ported  for  compatibility with historical System V versions
     of tr.
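
     A short sketch of ranges in use, assuming the usual ASCII
     ordering of letters; the input strings are arbitrary
     examples.

```shell
# Uppercase the hexadecimal letters a through f only.
hex_upper=$(printf 'deadbeef 42\n' | tr a-f A-F)
printf '%s\n' "$hex_upper"

# Several ranges can be concatenated in one string.  This pair of
# strings rotates each letter by 13 places (the "rot13" cipher).
rot13=$(printf 'Hello\n' | tr a-zA-Z n-za-mN-ZA-M)
printf '%s\n' "$rot13"
```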

     Repeated characters.  The notation `[c*n]' in string2
     expands to n copies of character c.  Thus, `[y*6]' is the
     same as `yyyyyy'.  The notation `[c*]' in string2 expands to
     as many copies of c as are needed to make set2 as long as
     set1.  If n begins with a 0, it is interpreted in octal,
     otherwise in decimal.
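
     The repeat notation can be sketched as follows.  The inputs
     are invented; the last command relies on the octal rule
     above, so `[x*010]' supplies 8 copies of x, not 10.

```shell
# [_*5] supplies five underscores, one for each vowel in set1.
fixed=$(printf 'education\n' | tr aeiou '[_*5]')
printf '%s\n' "$fixed"

# [_*] supplies as many underscores as set1 needs.
open_ended=$(printf 'education\n' | tr aeiou '[_*]')
printf '%s\n' "$open_ended"

# 010 is octal 8: digits 0-7 map to x, and the remaining two
# digits map to y.
octal_count=$(printf '0123456789\n' | tr 0-9 '[x*010]y')
printf '%s\n' "$octal_count"
```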

     Character classes.  The notation `[:class-name:]' expands to
     all of the characters in the (predefined) class named
     class-name.  The characters expand in no particular order,
     except for the `upper' and `lower' classes, which expand in
     ascending order.  When the --delete (-d) and
     --squeeze-repeats (-s) options are both given, any character
     class can be used in string2.  Otherwise, only the character
     classes `lower' and `upper' are accepted in string2, and
     then only if the corresponding character class (`upper' and
     `lower', respectively) is specified in the same relative
     position in string1.  Doing this specifies case conversion.
     The class names are given below; an error results when an
     invalid class name is given.

     alnum
          Letters and digits.

     alpha
          Letters.

     blank
          Horizontal whitespace.

     cntrl
          Control characters.

     digit
          Digits.

     graph
          Printable characters, not including space.

     lower
          Lowercase letters.

     print
          Printable characters, including space.

     punct
          Punctuation characters.

     space
          Horizontal or vertical whitespace.

     upper
          Uppercase letters.

     xdigit
          Hexadecimal digits.
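
     A few of the classes above in use.  This is an illustrative
     sketch with made-up input, run in the default (C/POSIX)
     locale.

```shell
# Case conversion: lower in string1 paired with upper in string2.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"

# Delete every punctuation character.
no_punct=$(printf 'Hi, there!\n' | tr -d '[:punct:]')
printf '%s\n' "$no_punct"

# Keep only digits: delete the complement of the digit class
# (this also removes the trailing newline).
digits_only=$(printf 'order #4521, qty 3\n' | tr -cd '[:digit:]')
printf '%s\n' "$digits_only"
```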

     Equivalence classes.  The syntax `[=c=]' expands to all of
     the characters that are equivalent to c, in no particular
     order.  Equivalence classes are a recent invention intended
     to support non-English alphabets, but there seems to be no
     standard way to define them or determine their contents.
     Therefore, they are not fully implemented in GNU tr; each
     character's equivalence class consists only of that
     character, which currently makes this construction useless.
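
     Under the behavior described above, an equivalence class in
     string1 acts exactly like the bare character.  A small
     sketch, assuming GNU tr and an arbitrary sample word:

```shell
# [=a=] contains only the character a, so both commands agree.
with_class=$(printf 'banana\n' | tr '[=a=]' x)
plain=$(printf 'banana\n' | tr a x)
printf '%s\n%s\n' "$with_class" "$plain"
```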

  TRANSLATING
     tr performs translation when string1 and string2 are both
     given and the --delete (-d) option is not given.  tr
     translates each character of its input that is  in  set1  to
     the corresponding character in set2.  Characters not in set1
     are passed through unchanged.  When a character appears more
     than  once  in set1 and the corresponding characters in set2
     are not all the same, only the final one is used.  For exam-
     ple, these two commands are equivalent:
          tr aaa xyz
          tr a z
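
     The equivalence can be checked directly.  A minimal sketch,
     assuming a tr that resolves duplicates as described here
     (GNU tr does); the sample word is arbitrary.

```shell
# a appears three times in set1; only the last mapping (a -> z) counts.
dup=$(printf 'banana\n' | tr aaa xyz)
single=$(printf 'banana\n' | tr a z)
printf '%s\n%s\n' "$dup" "$single"
```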

     A common use of tr is to  convert  lowercase  characters  to
     uppercase.   This  can be done in many ways.  Here are three
     of them:
          tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
          tr a-z A-Z
          tr '[:lower:]' '[:upper:]'
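
     As a sketch, the three forms can be compared on the same
     input.  The sample text is invented, and the range form
     assumes ASCII letter ordering.

```shell
text='Mixed Case 123'
by_list=$(printf '%s\n' "$text" |
    tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ)
by_range=$(printf '%s\n' "$text" | tr a-z A-Z)
by_class=$(printf '%s\n' "$text" | tr '[:lower:]' '[:upper:]')
printf '%s\n%s\n%s\n' "$by_list" "$by_range" "$by_class"
```

     The character class is quoted so that the shell does not
     treat the square brackets as a filename pattern.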

     When tr is performing translation, set1 and set2 should nor-
     mally  have  the same length.  If set1 is shorter than set2,
     the extra characters at the end of set2 are ignored.

     On the other hand, making set1 longer than set2 is not port-
     able;  POSIX.2  says  that the result is undefined.  In this
     situation, the BSD tr pads set2 to the  length  of  set1  by
     repeating the last character of set2 as many times as neces-
     sary.  The System V tr truncates set1 to the length of set2.

     By default, GNU tr handles this case like the BSD tr does.
     When the --truncate-set1 (-t) option is given, GNU tr
     handles this case like the System V tr instead.  This option
     is ignored for operations other than translation.
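
     The two behaviors can be compared side by side.  This sketch
     assumes GNU tr, since -t is not available in every tr
     implementation; the input is arbitrary.

```shell
# Default (BSD-like): set2 "xy" is padded to "xyyy".
padded=$(printf 'abcd\n' | tr abcd xy)
printf '%s\n' "$padded"

# With -t (System V-like): set1 is truncated to "ab".
truncated=$(printf 'abcd\n' | tr -t abcd xy)
printf '%s\n' "$truncated"
```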

     Acting like the System V tr in this case  breaks  the  rela-
     tively common BSD idiom:
          tr -cs A-Za-z0-9 '\012'
     because it converts only zero bytes (the  first  element  in
     the  complement of set1), rather than all non-alphanumerics,
     to newlines.
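
     A sketch of the idiom with and without truncation, assuming
     GNU tr and an invented input line:

```shell
# BSD-style padding: every non-alphanumeric becomes a newline,
# and runs of newlines are squeezed.
idiom=$(printf 'hello, world!\n' | tr -cs A-Za-z0-9 '\012')
printf '%s\n' "$idiom"

# With -t, only the zero byte is translated, so the punctuation
# and the space pass through unchanged.
sysv=$(printf 'hello, world!\n' | tr -cst A-Za-z0-9 '\012')
printf '%s\n' "$sysv"
```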

  SQUEEZING REPEATS AND DELETING
     When given just the --delete (-d) option, tr removes any
     input characters that are in set1.

     When given just the --squeeze-repeats (-s) option, tr
     replaces each input sequence of a repeated character that is
     in set1 with a single occurrence of that character.

     When given  both  the  --delete  and  the  --squeeze-repeats
     options,  tr  first  performs any deletions using set1, then
     squeezes repeats from any remaining characters using set2.

     The --squeeze-repeats option may also be used when
     translating, in which case tr first performs translation, then
     squeezes repeats from any remaining characters using set2.
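
     The four combinations described in this section can be
     sketched on one sample word (chosen only for its repeated
     letters):

```shell
word='mississippi'

# -d alone: remove every s.
deleted=$(printf '%s\n' "$word" | tr -d s)

# -s alone: squeeze runs of s and of p.
squeezed=$(printf '%s\n' "$word" | tr -s sp)

# -d and -s: delete s (set1), then squeeze runs of p (set2).
both=$(printf '%s\n' "$word" | tr -ds s p)

# Translate s to z, then squeeze runs of z (set2); pp is untouched.
translated=$(printf '%s\n' "$word" | tr -s s z)

printf '%s\n' "$deleted" "$squeezed" "$both" "$translated"
```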

     Here are some examples to illustrate various combinations of
     options:

     Remove all zero bytes:
          tr -d '\000'

     Put all words on lines by  themselves.   This  converts  all
     non-alphanumeric  characters to newlines, then squeezes each
     string of repeated newlines into a single newline:
          tr -cs '[a-zA-Z0-9]' '[\n*]'

     Convert each sequence of repeated newlines to a single  new-
     line:
          tr -s '\n'
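
     The examples above can be exercised on small inputs.  This
     is a sketch with invented sample data; it assumes a printf
     that accepts octal escapes in its format string, as POSIX
     specifies.

```shell
# Remove all zero bytes.
nul_removed=$(printf 'a\000b\n' | tr -d '\000')
printf '%s\n' "$nul_removed"

# Put all words on lines by themselves.
words=$(printf 'one, two;  three\n' | tr -cs '[a-zA-Z0-9]' '[\n*]')
printf '%s\n' "$words"

# Convert each sequence of repeated newlines to a single newline.
single_spaced=$(printf 'a\n\n\nb\n' | tr -s '\n')
printf '%s\n' "$single_spaced"
```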

  WARNING MESSAGES
     Setting the environment variable POSIXLY_CORRECT  turns  off
     several  warning  and  error messages, for strict compliance
     with POSIX.2.  The messages normally occur in the  following
     circumstances:

     1.  When the --_d_e_l_e_t_e option is given but  --_s_q_u_e_e_z_e-_r_e_p_e_a_t_s
     is  not,  and  _s_t_r_i_n_g_2  is given, GNU tr by default prints a
     usage message and exits, because _s_t_r_i_n_g_2 would not be  used.
     The POSIX specification says that _s_t_r_i_n_g_2 must be ignored in
     this case.  Silently ignoring arguments is a bad idea.

     2.  When an ambiguous octal escape is given.   For  example,
     \400  is  actually  \40 followed by the digit 0, because the
     value 400 octal does not fit into a single byte.
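
     The interpretation of the ambiguous escape can be observed
     as follows.  This is a sketch assuming GNU tr; the warning
     goes to standard error and is discarded here so that only
     the translation is shown.

```shell
# '\400' is read as '\040' (a space) followed by the digit 0, so
# set1 has two members: the space maps to x and the 0 maps to y.
ambiguous=$(printf ' 0\n' | tr '\400' xy 2>/dev/null)
printf '%s\n' "$ambiguous"
```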

     Note that GNU tr does not provide complete BSD or  System  V
     compatibility.   For  example, there is no option to disable
     interpretation of the POSIX constructs [:alpha:], [=c=], and
     [c*10].   Also,  GNU tr does not delete zero bytes automati-
     cally, unlike traditional UNIX versions,  which  provide  no
     way to preserve zero bytes.
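
     That zero bytes pass through can be checked by counting
     bytes.  A minimal sketch, assuming GNU tr and an arbitrary
     three-byte input:

```shell
# Input is a, NUL, b.  The translation touches only the b, and
# all three bytes reach the output.
byte_count=$(printf 'a\000b' | tr b c | wc -c)
printf '%s\n' "$byte_count"
```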

     The long-named options can be introduced with `+' as well as
     `--',  for compatibility with previous releases.  Eventually
     support for `+' will be removed, because it is  incompatible
     with the POSIX.2 standard.